Á¤º¸°úÇÐȸ ÄÄÇ»ÆÃÀÇ ½ÇÁ¦ ³í¹®Áö (KIISE Transactions on Computing Practices)
ÇѱÛÁ¦¸ñ(Korean Title) |
ÇÏµÓ ¿¡ÄڽýºÅÛÀ» È°¿ëÇÑ ·Î±× µ¥ÀÌÅÍÀÇ ÀÌ»ó ŽÁö ±â¹ý |
¿µ¹®Á¦¸ñ(English Title) |
Anomaly Detection Technique of Log Data Using Hadoop Ecosystem |
ÀúÀÚ(Author) |
¼Õ½Ã¿î
±æ¸í¼±
¹®¾ç¼¼
Siwoon Son
Myeong-Seon Gil
Yang-Sae Moon
|
¿ø¹®¼ö·Ïó(Citation) |
VOL 23 NO. 02 PP. 0128 ~ 0133 (2017. 02) |
Çѱ۳»¿ë (Korean Abstract) |
ÃÖ±Ù ´ë¿ë·® µ¥ÀÌÅÍ ºÐ¼®À» À§ÇØ ´Ù¼öÀÇ ¼¹ö¸¦ »ç¿ëÇÏ´Â ½Ã½ºÅÛÀÌ Áõ°¡ÇÏ°í ÀÖ´Ù. ´ëÇ¥ÀûÀÎ ºòµ¥ÀÌÅÍ ±â¼úÀÎ ÇϵÓÀº ´ë¿ë·® µ¥ÀÌÅ͸¦ ´Ù¼öÀÇ ¼¹ö·Î ±¸¼ºµÈ ºÐ»ê ȯ°æ¿¡ ÀúÀåÇÏ¿© ó¸®ÇÑ´Ù. ÀÌ·¯ÇÑ ºÐ»ê ½Ã½ºÅÛ¿¡¼´Â °¢ ¼¹öÀÇ ½Ã½ºÅÛ ÀÚ¿ø °ü¸®°¡ ¸Å¿ì Áß¿äÇÏ´Ù. º» ³í¹®Àº ´Ù¼öÀÇ ¼¹ö¿¡¼ ¼öÁýµÈ ·Î±× µ¥ÀÌÅ͸¦ Åä´ë·Î °£´ÜÇÏ¸é¼ È¿À²ÀûÀÎ ÀÌ»ó ŽÁö ±â¹ýÀ» »ç¿ëÇÏ¿© ·Î±× µ¥ÀÌÅÍÀÇ º¯È°¡ ±ÞÁõÇÏ´Â ÀÌ»óÄ¡¸¦ ŽÁöÇÏ°íÀÚ ÇÑ´Ù. À̸¦ À§ÇØ, °¢ ¼¹ö·ÎºÎÅÍ ·Î±× µ¥ÀÌÅ͸¦ ¼öÁýÇÏ¿© ÇÏµÓ ¿¡ÄڽýºÅÛ¿¡ ÀúÀåÇÒ ¼ö ÀÖµµ·Ï Apache HiveÀÇ ÀúÀå ±¸Á¶¸¦ ¼³°èÇÏ°í, À̵¿ Æò±Õ ¹× 3-½Ã±×¸¶¸¦ »ç¿ëÇÑ ¼¼ °¡Áö ÀÌ»ó ŽÁö ±â¹ýÀ» ¼³°èÇÑ´Ù. ¸¶Áö¸·À¸·Î ½ÇÇèÀ» ÅëÇØ ¼¼ °¡Áö ±â¹ýÀÌ ¸ðµÎ ¿Ã¹Ù·Î ÀÌ»ó ±¸°£À» ŽÁöÇϸç, ¶ÇÇÑ °¡ÁßÄ¡°¡ Àû¿ëµÈ ÀÌ»ó ŽÁö ±â¹ýÀÌ Áߺ¹À» Á¦°ÅÇÑ ´õ Á¤È®ÇÑ Å½Áö ±â¹ýÀÓÀ» È®ÀÎÇÑ´Ù. º» ³í¹®Àº ÇÏµÓ ¿¡ÄڽýºÅÛÀ» »ç¿ëÇÏ¿© °£´ÜÇÑ ¹æ¹ýÀ¸·Î ·Î±× µ¥ÀÌÅÍÀÇ ÀÌ»óÀ» ŽÁöÇÏ´Â ¿ì¼öÇÑ °á°ú¶ó »ç·áµÈ´Ù.
|
¿µ¹®³»¿ë (English Abstract) |
In recent years, the number of systems for the analysis of large volumes of data is increasing. Hadoop, a representative big data system, stores and processes the large data in the distributed environment of multiple servers, where system-resource management is very important. The authors attempted to detect anomalies from the rapid changing of the log data that are collected from the multiple servers using simple but efficient anomaly-detection techniques. Accordingly, an Apache Hive storage architecture was designed to store the log data that were collected from the multiple servers in the Hadoop ecosystem. Also, three anomaly-detection techniques were designed based on the moving-average and 3-sigma concepts. It was finally confirmed that all three of the techniques detected the abnormal intervals correctly, while the weighted anomaly-detection technique is more precise than the basic techniques. These results show an excellent approach for the detection of log-data anomalies with the use of simple techniques in the Hadoop ecosystem.
|
Å°¿öµå(Keyword) |
ºòµ¥ÀÌÅÍ
¾ÆÆÄÄ¡ ÇϵÓ
¾ÆÆÄÄ¡ ÇÏÀ̺ê
·Î±× µ¥ÀÌÅÍ
ÀÌ»ó ŽÁö
Big Data
Apache Hadoop
Apache Hive
log data
anomaly detection
|
ÆÄÀÏ÷ºÎ |
PDF ´Ù¿î·Îµå
|